attention_sac_cm: multi-type critic, independent policies (one per agent). Within a type: mean field; between types: attention
attention_sac_cm_virtual: based on attention_sac_cm, with one virtual agent for each type
attention_sac_cm_cmf: based on attention_sac_cm; actor input is the state, critic input is the state + mean field
attention_sac_cm_cmf_noAtt: based on attention_sac_cm_cmf; within type: mean field; between types: vanilla, no attention (the mean fields of all networks are fed into one network)
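
Rough sketch of the two aggregation ideas named above: "within type mean field" and "between type attention", plus the noAtt alternative of feeding all mean fields into one input. This is not the repo code. The function names, the toy data, the use of actions as the averaged quantity, and the use of raw mean fields as keys and values are all assumptions made for illustration.

```python
import math

def type_mean_fields(actions, agent_types):
    """Within-type mean field: average the action vectors of each type."""
    fields = {}
    for typ in sorted(set(agent_types)):
        members = [a for a, t in zip(actions, agent_types) if t == typ]
        fields[typ] = [sum(col) / len(members) for col in zip(*members)]
    return fields

def attend(query, keys, values):
    """Scaled dot-product attention of one query over a list of vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return context, weights

# Hypothetical toy setup: 4 agents, 2 types, 2-dimensional actions.
actions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [3.0, 1.0]]
agent_types = [0, 0, 1, 1]

fields = type_mean_fields(actions, agent_types)
per_type = [fields[t] for t in sorted(fields)]

# Between-type attention: one query attends over the per-type mean fields.
query = [1.0, 0.0]
context, weights = attend(query, per_type, per_type)

# noAtt-style alternative: concatenate all mean fields into one flat input.
flat_input = [x for field in per_type for x in field]
```

In a real critic the query, keys and values would come from learned embeddings rather than raw mean fields; the sketch only shows where the two kinds of aggregation sit.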



